Random quantum codes from Gaussian ensembles and an uncertainty relation

Using random Gaussian vectors and an information-uncertainty relation, we give a proof that the coherent information is an achievable rate for entanglement transmission through a noisy quantum channel. The codes are random subspaces selected according to the Haar measure, but distorted as a function of the sender's input density operator. Using large deviations techniques, we show that classical data transmitted in either of two Fourier-conjugate bases for the coding subspace can be decoded with low probability of error. A recently discovered information-uncertainty relation then implies that the quantum mutual information for entanglement encoded into the subspace and transmitted through the channel will be high. The monogamy of quantum correlations finally implies that the environment of the channel cannot be significantly coupled to the entanglement, which ensures the existence of a decoding by the receiver.



I. PROBLEM AND BACKGROUND 

For a bipartite quantum state $\rho^{AB}$, the coherent information is defined to be

$$I(A\rangle B)_\rho = H(\rho^B) - H(\rho^{AB}),$$

where $H$ denotes the von Neumann entropy. Sometimes, if the state is clear from context, we omit the subscript and simply write $H(A)$, $I(A\rangle B)$, etc. By way of notation, we adopt the habit of writing the (Hilbert space) dimension of $A$ as $|A|$.
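As a small illustration of the definition, the following sketch evaluates the coherent information from the eigenvalues of the reduced state on B and of the joint state on AB. The function names `entropy` and `coherent_information` are hypothetical helpers, and the two example spectra (a maximally entangled and a maximally mixed two-qubit state) are chosen only for illustration; logarithms are taken to base 2.

```python
import math

def entropy(spectrum):
    """Von Neumann entropy in bits of a state, given its eigenvalues."""
    return -sum(p * math.log2(p) for p in spectrum if p > 0)

def coherent_information(spectrum_b, spectrum_ab):
    """I(A>B) = H(rho^B) - H(rho^AB), computed from the two spectra."""
    return entropy(spectrum_b) - entropy(spectrum_ab)

# Maximally entangled two-qubit state: rho^AB is pure, rho^B is maximally mixed.
bell = coherent_information([0.5, 0.5], [1.0, 0.0, 0.0, 0.0])

# Maximally mixed two-qubit state: the coherent information is negative.
mixed = coherent_information([0.5, 0.5], [0.25] * 4)
```

The first example gives one ebit per copy, while the second shows that the quantity, unlike a mutual information, can be negative.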

The hashing inequality 0] is the statement that asymp- 
totically many copies of p have a yield of I(A)B) ebits 
per copy under entanglement distillation procedures with 
only local operations and one-way classical communica- 
tion from Alice to Bob. 

Closely related, for a quantum channel (i.e. a completely positive, trace preserving - cptp - map on density operators)

$$\mathcal{N} : \mathcal{B}(A') \longrightarrow \mathcal{B}(B)$$

and a reference state $\rho^{A'}$ on $A'$, we can define the coherent information $I_c(\rho;\mathcal{N})$ of the channel with respect to $\rho$ as follows: consider a purification $|\phi\rangle^{AA'}$ of $\rho^{A'}$, and letting $\omega^{AB} := (\mathrm{id} \otimes \mathcal{N})|\phi\rangle\langle\phi|$, define

$$I_c(\rho;\mathcal{N}) = I(A\rangle B)_\omega.$$

Introducing an isometric Stinespring dilation

$$V : A' \hookrightarrow B \otimes E,$$

for $\mathcal{N}$ mapping the input Hilbert space $A'$ into the combined output and environment spaces, we can re-express this quantity as follows: introduce the three-party state

$$|\psi\rangle^{ABE} = (\mathbb{1} \otimes V)|\phi\rangle^{AA'},$$

which is a purification of $\omega^{AB}$. Then

$$I_c(\rho;\mathcal{N}) = H(B)_\psi - H(E)_\psi.$$

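Finally, we need the concept of quantum code: for a channel $\mathcal{N} : \mathcal{B}(A') \to \mathcal{B}(B)$, this is given by a pair of cptp encoding and decoding maps

$$\mathcal{E} : \mathcal{B}(\mathbb{C}^N) \to \mathcal{B}(A'), \qquad \mathcal{D} : \mathcal{B}(B) \to \mathcal{B}(\mathbb{C}^N).$$

The important parameters of a code are the dimension $N$ of the encoded system, and the error $P^q_{\rm err}$, given by the trace distance between the maximally entangled state $\Phi_N$ and its image $(\mathrm{id} \otimes \mathcal{D}\circ\mathcal{N}\circ\mathcal{E})\Phi_N$ under the code, where $\Phi_N = \frac{1}{N}\sum_{j,k} |jj\rangle\langle kk|$ is the maximally entangled state on $\mathbb{C}^N \otimes \mathbb{C}^N$. For more on the history of these concepts, motivation, etc., we refer the reader to the companion papers [14] and see also [20].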

The main results we are going to prove are the follow- 
ing two: 

Theorem 1 Let $\mathcal{N} : \mathcal{B}(A') \to \mathcal{B}(B)$ be a quantum channel with Stinespring dilation $V : A' \hookrightarrow BE$, $\rho$ an input density operator, and $P^B$, $P^E$ projections in $B$, $E$, respectively, with the following properties (for some $1/3 \geq \epsilon > 0$ and $D, \Lambda > 0$):

$$\operatorname{Tr}\big((V\rho V^\dagger)(P^B \otimes P^E)\big) \geq 1-\epsilon,$$
$$P^B \mathcal{N}(\rho) P^B \leq D^{-1} P^B,$$
$$\rho \leq \Lambda^{-1}\mathbb{1}.$$

Then, for $0 < \eta < 1$, there exists a quantum code with encoded dimension



N < min < rj 



D 



rankP s 



and error $P^q_{\rm err} \leq 2\sqrt{2H_2(2\lambda) + 4\lambda\log N}$, where $H_2(x) = -x\log x - (1-x)\log(1-x)$ is the binary entropy, and

$$\lambda = 9\sqrt{\epsilon} + 7\sqrt{\eta} + 3N\exp(-\Lambda\epsilon^2/4).$$

Assuming $N \geq 2$, one obtains the simplified error bound

$$P^q_{\rm err} \leq 7\sqrt{\log N\,\sqrt{\lambda}}.$$
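To get a feeling for the size of these error bounds, the sketch below evaluates them numerically. It assumes the reading $\lambda = 9\sqrt{\epsilon} + 7\sqrt{\eta} + 3N\exp(-\Lambda\epsilon^2/4)$, the error bound $2\sqrt{2H_2(2\lambda) + 4\lambda\log N}$ and the simplified bound $7\sqrt{\log N\sqrt{\lambda}}$, with logarithms to base 2; the function names and the sample parameter values are hypothetical.

```python
import math

def binary_entropy(x):
    """H_2(x) in bits, with the convention H_2(0) = H_2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def lam(eps, eta, n_dim, big_lambda):
    """lambda = 9 sqrt(eps) + 7 sqrt(eta) + 3 N exp(-Lambda eps^2 / 4)."""
    return (9 * math.sqrt(eps) + 7 * math.sqrt(eta)
            + 3 * n_dim * math.exp(-big_lambda * eps ** 2 / 4))

def error_bound(lmbda, n_dim):
    """2 sqrt(2 H_2(2 lambda) + 4 lambda log N)."""
    return 2 * math.sqrt(2 * binary_entropy(2 * lmbda)
                         + 4 * lmbda * math.log2(n_dim))

def simplified_bound(lmbda, n_dim):
    """7 sqrt(log N sqrt(lambda)), valid for N >= 2."""
    return 7 * math.sqrt(math.log2(n_dim) * math.sqrt(lmbda))

# Hypothetical parameter values, chosen only to make lambda small.
example_lambda = lam(eps=1e-10, eta=1e-10, n_dim=2 ** 10, big_lambda=1e22)
example_error = error_bound(example_lambda, 2 ** 10)
```

The bounds only become non-trivial once $\lambda \log N$ is small, which is the regime reached in the memoryless setting where $\epsilon$ and $\eta$ decay exponentially in the block length.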

A particular case is that of a memoryless channel $\mathcal{N}^{\otimes n}$. We call $Q$ an achievable quantum rate for $\mathcal{N}$ if there exists a sequence of codes $(\mathcal{E}_n, \mathcal{D}_n)$ with input dimensions $N_n$ and error $P^q_{\rm err} \to 0$ as $n \to \infty$, such that

$$\liminf_{n\to\infty} \frac{1}{n}\log N_n \geq Q.$$



Theorem 2 (Lloyd [2l|, Shor HH and DevetakQ) 

Consider a quantum channel $\mathcal{N} : \mathcal{B}(A') \to \mathcal{B}(B)$, and an input state $\rho$ on $A'$. Then, the coherent information $I_c(\rho;\mathcal{N})$ is an achievable quantum rate.

In fact, using the concept of typical subspace, the second theorem follows easily from the first. We will prove Theorem 1 in section IV after introducing Gaussian random vectors in section II and describing the random codes we are going to look at in section III. The great conceptual significance of Theorem 2 is that it makes it possible to express the quantum capacity of $\mathcal{N}$, i.e. the largest achievable rate, in terms of the coherent information; thanks to a matching upper bound by Schumacher and Nielsen [25], the capacity is thus given by

$$Q(\mathcal{N}) = \lim_{n\to\infty} \frac{1}{n} \max_{\rho^{(n)}} I_c\big(\rho^{(n)};\mathcal{N}^{\otimes n}\big).$$

Deducing Theorem 2 from Theorem 1 is a straightforward application of typical subspace techniques [24] - see appendix A: choose projectors $P^A_\delta$, $P^B_\delta$, $P^E_\delta$ in $A^n$, $B^n$, $E^n$, respectively, according to Lemma 11 (appendix A). Furthermore, let $A' = A_\delta$ be the support of $P^A_\delta$, $B = B^n$, $E = E^n$ and

$$\rho = \frac{1}{\operatorname{Tr}(\rho^{\otimes n} P^A_\delta)}\, P^A_\delta\, \rho^{\otimes n} P^A_\delta.$$

Then the conditions of Theorem 1 are satisfied, with $\operatorname{rank} P^E = 2^{nH(E)+n\delta}$, $D = 2^{nH(B)-n\delta}$ and $\Lambda = 2^{nH(A)-n\delta}$ for $\epsilon = 2\cdot 2^{-cn\delta}$ and all sufficiently large $n$. Letting $\eta = 2^{-cn\delta}$, we see that we may take $N = 2^{nI(A\rangle B)-3n\delta}$, and then get a code of encoded dimension $N$ and with error exponentially small in $n$. In other words, the rate $I(A\rangle B) - 3\delta$ is achievable; since $\delta > 0$ is arbitrary, Theorem 2 follows. □

The strategy we will use to prove Theorem 1 will be familiar from various Shannon-style proofs; we shall find a subspace of the input space by an appropriate random selection. However, the analysis of the code differs from
the approaches of the companion papers [l2j and [H] . 

Both these and the present proof hinge on the demon- 
stration that the input and environment of the channel 
decouple when used with the appropriate code. Once 
this decoupling is established, the existence of a decod- 
ing/error correction procedure for the receiver follows by 
a standard argument. 

So, all three proofs proceed via decoupling of the chan- 
nel environment or, equivalently, by forcing the quantum 
mutual information between input and environment to 
be (close to) zero. This is shown by direct calculation 
in [T3|. In [l||, following 0, one first shows that the 
code subspace has a basis such that the receiver can suc- 
cessfully measure-decode the basis state while the envi- 
ronment learns (almost) nothing about it - after which 
one "makes the decoding coherent" . Here, it is done by 
not involving the environment at all: instead, we show 
that both a special orthonormal basis of the subspace as 
well as the Fourier conjugate basis can be decoded at the 
output. This means that the Holevo quantities of the two 
state ensembles, basis and Fourier-conjugate, are close to 
maximal, implying, via a recent information-uncertainty 
relation, that the quantum mutual information down the 
channel is close to maximal. This finally yields the con- 
clusion that the crucial mutual information between the 
input and the environment is close to zero. 

We think that this analysis is closest (among the three 
proofs collected in this issue) to the original idea in [2(| . 
It is still not the same, as there an explicit description of a 
quantum decoder is given, without recourse to decoupling 
the input from the environment. See however the recent 
paper [l9j for an alternative argument. 

The rest of the paper is organised as follows: in section II we introduce the notion of Gaussian distributed random vectors ("Gaussian vectors" for short) and review some of their properties, mostly cited from [4], except for a tail bound on the quantum expectation of random states with an arbitrary observable. Then, in section III we define the quantum codes, which we show in section IV to be good quantum transmission codes achieving the bound of Theorem 1. Two appendices serve to collect various auxiliary results about states, measurements, and typical subspaces used throughout the paper, in addition to miscellaneous proofs.



II. GAUSSIAN VECTORS 

We take the following definitions in abridged form from 
appendix A of [1] ; the interested reader is encouraged to 
consult the referenced paper. 

A Gaussian complex number with mean 0 and variance $\sigma^2 > 0$ is a random variable $X + iY$, where $X$ and $Y$ are independent real random variables with $X \sim \mathcal{N}(0, \sigma^2/2)$ and $Y \sim \mathcal{N}(0, \sigma^2/2)$. Its distribution is denoted $\mathcal{N}_{\mathbb{C}}(0, \sigma^2)$.

For any orthonormal basis $\{|1\rangle, \ldots, |D\rangle\}$ of $\mathbb{C}^D$, a Gaussian vector is defined to be a random variable $|g\rangle \in \mathbb{C}^D$ whose distribution is described as follows:

$$|g\rangle = \sum_{i=1}^{D} c_i |i\rangle,$$

with $D$ independent Gaussian complex numbers $c_1, \ldots, c_D \sim \mathcal{N}_{\mathbb{C}}(0, 1/D)$. It is a fundamental property of the above sum that the resulting distribution is independent of the basis chosen. I.e., the distribution is unitarily invariant, and in particular, its density depends only on the length $\||g\rangle\|_2 = \sqrt{\langle g|g\rangle} = \sqrt{\sum_i |c_i|^2}$. Indeed, we defined the Gaussian vectors in just such a way that $\mathbb{E}\langle g|g\rangle = 1$. And according to Lemma 3 below the distribution is strongly concentrated around this value.
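The definition can be made concrete with a short sampling sketch. It draws coefficients $c_i \sim \mathcal{N}_{\mathbb{C}}(0, 1/D)$, i.e. real and imaginary parts each of variance $1/(2D)$, and estimates $\mathbb{E}\langle g|g\rangle$ empirically. The helper names, the seed, the dimension and the sample count are arbitrary choices made for this illustration.

```python
import math
import random

def gaussian_vector(dim, rng):
    """Sample |g> in C^dim with i.i.d. coefficients c_i ~ N_C(0, 1/dim)."""
    s = math.sqrt(1.0 / (2 * dim))  # std of the real and of the imaginary part
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(dim)]

def squared_norm(vec):
    """<g|g> = sum_i |c_i|^2."""
    return sum(abs(c) ** 2 for c in vec)

rng = random.Random(0)   # fixed seed so the run is reproducible
D = 50
norms = [squared_norm(gaussian_vector(D, rng)) for _ in range(400)]
mean_norm = sum(norms) / len(norms)   # empirical estimate of E<g|g>
```

Individual samples fluctuate around 1 with standard deviation of order $1/\sqrt{D}$, which is the concentration made quantitative in the lemma that follows.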

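Lemma 3 Let $|g\rangle$ and $|g_1\rangle, \ldots, |g_K\rangle$ be independent Gaussian vectors in $\mathbb{C}^D$. Then, for $0 < \epsilon < 1$,

$$\Pr\big\{ |\operatorname{Tr}|g\rangle\langle g| - 1| \geq \epsilon \big\} \leq 2\exp(-\epsilon^2 D/6),$$

and, for a projector $P$ of rank $r$,

$$\Pr\left\{ \left| \sum_{k=1}^{K} \operatorname{Tr}|g_k\rangle\langle g_k|P - \frac{rK}{D} \right| \geq \epsilon\,\frac{rK}{D} \right\} \leq 2\exp\left(-\frac{\epsilon^2}{6}\, rK\right).$$

Furthermore, for $\epsilon \leq 1/3$, and $0 \leq A \leq \mathbb{1}$ an operator,

$$\Pr\left\{ \operatorname{Tr}|g\rangle\langle g|A > (1+\epsilon)\frac{\operatorname{Tr}A}{D} \right\} \leq \exp\left(-\frac{\epsilon^2}{4}\operatorname{Tr}A\right), \qquad (1)$$

$$\Pr\left\{ \operatorname{Tr}|g\rangle\langle g|A < (1-\epsilon)\frac{\operatorname{Tr}A}{D} \right\} \leq \exp\left(-\frac{\epsilon^2}{4}\operatorname{Tr}A\right). \qquad (2)$$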



Proof. The first and second statement, about the lengths 
of Gaussian vectors and average inner products, is from 
Lemma 3 in Q - see also appendix A there - or Lemma 
II.3 in [H]. 

The third is a generalisation of Lemma 3 in [4J (Lemma 
II. 3 in [TH). It is proved in appendix B. □ 
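The third statement of the lemma can be probed numerically. The sketch below assumes the reading $\Pr\{\operatorname{Tr}|g\rangle\langle g|A > (1+\epsilon)\operatorname{Tr}A/D\} \leq \exp(-\epsilon^2\operatorname{Tr}A/4)$ and the analogous lower-tail bound, and compares both with empirical frequencies for a diagonal test operator. The operator, the seed and the parameters are hypothetical; a Monte Carlo run of this kind illustrates the bounds but of course proves nothing.

```python
import math
import random

def overlap(eigs, dim, rng):
    """Sample Tr(|g><g|A) = sum_i a_i |c_i|^2 for diagonal A, c_i ~ N_C(0, 1/dim)."""
    s = math.sqrt(1.0 / (2 * dim))  # std of the real and of the imaginary part
    return sum(a * (rng.gauss(0.0, s) ** 2 + rng.gauss(0.0, s) ** 2) for a in eigs)

rng = random.Random(1)
D, eps, trials = 30, 0.3, 2000
eigs = [1.0] * 20 + [0.0] * 10     # hypothetical test operator with Tr A = 20
tr_a = sum(eigs)

samples = [overlap(eigs, D, rng) for _ in range(trials)]
freq_upper = sum(x > (1 + eps) * tr_a / D for x in samples) / trials
freq_lower = sum(x < (1 - eps) * tr_a / D for x in samples) / trials
bound = math.exp(-eps ** 2 * tr_a / 4)   # right hand side of both tail bounds
```

For these parameters the empirical tail frequencies lie well below the bound, which is not tight at small $\operatorname{Tr}A$.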



III. RANDOM SUBSPACE PROJECTORS 

For an input space $A$ of dimension $|A|$, and reference state $\rho$, the code will be chosen as follows: pick a subspace $S_0$ of dimension $N$ according to the Haar measure, denoting its corresponding subspace projector $P_0$. Then, let $S = \sqrt{\rho}\,S_0$, so its subspace projection $P$ projects onto $\operatorname{supp}\sqrt{\rho}P_0\sqrt{\rho}$, the support of the operator $\sqrt{\rho}P_0\sqrt{\rho}$; this will be our random code for Theorem 1.

Our preferred way of describing this random selection is via a spanning set of vectors drawn independently as follows. For $j = 1, \ldots, N$, let $|g_j\rangle$ be i.i.d. Gaussian vectors in $A$. With probability one, these are linearly independent, so they span an $N$-dimensional subspace $S_0$, which, by the unitary invariance of the Gaussian measure, is itself distributed according to the unitarily invariant measure. Now let

$$|\gamma_j\rangle := \sqrt{|A|\rho}\,|g_j\rangle.$$

These vectors will turn out to be almost normalised, with high probability. They clearly span $S = \sqrt{\rho}\,S_0$, but we are after more; we need an orthogonal basis of $S$. To get this, we follow the recipe of the "square root" or "pretty good" measurement: with the (random) operator $\Gamma := \sum_{j=1}^{N} |\gamma_j\rangle\langle\gamma_j|$, we finally define

$$|\phi_j\rangle := \Gamma^{-1/2}|\gamma_j\rangle,$$

which is an orthogonal basis of $S$ (if the $|\gamma_j\rangle$ are linearly independent) because the subspace projector is

$$P = \Gamma^{-1/2}\,\Gamma\,\Gamma^{-1/2} = \sum_{j=1}^{N} |\phi_j\rangle\langle\phi_j|.$$
As outlined in the introduction, we will aim to show that this basis, sent through the channel with equal probabilities, will yield an output ensemble of states $\sigma_j = \mathcal{N}(\phi_j)$ with Holevo information close to $\log N$. In fact, we have to show this for the basis $\{|\phi_j\rangle\}$ as well as for its Fourier-conjugate basis consisting of the vectors

$$|\hat\phi_k\rangle = \frac{1}{\sqrt{N}}\sum_{j=1}^{N} e^{2\pi i jk/N}|\phi_j\rangle.$$



On the face of it, this set of vectors could have a pecu- 
liar, perhaps hard to describe, distribution. This is not 
at all the case thanks to the particular properties of the 
Gaussian distribution and the Fourier transform. 

Definition 4 We call a family $\{|w_1\rangle, \ldots, |w_N\rangle\}$ of vectors formally Fourier-conjugate to the family of vectors $\{|v_1\rangle, \ldots, |v_N\rangle\}$, if for all $k$,

$$|w_k\rangle = \frac{1}{\sqrt{N}}\sum_{j=1}^{N} e^{2\pi i jk/N}|v_j\rangle.$$

Note that we do not demand normalisation or orthogonality of the vectors in either family. Also, the dimension $D$ of the space may be different from $N$.

Lemma 5 If the family $\{|w_1\rangle, \ldots, |w_N\rangle\}$ of vectors is the formal Fourier-conjugate of the family $\{|v_1\rangle, \ldots, |v_N\rangle\}$, then for all $j$,

$$|v_j\rangle = \frac{1}{\sqrt{N}}\sum_{k=1}^{N} e^{-2\pi i jk/N}|w_k\rangle.$$

Furthermore,

$$\sum_{j=1}^{N} |v_j\rangle\langle v_j| = \sum_{k=1}^{N} |w_k\rangle\langle w_k|.$$

Finally, if $\{|v_1\rangle, \ldots, |v_N\rangle\}$ are independent Gaussian vectors with $N \leq D$, then so are $\{|w_1\rangle, \ldots, |w_N\rangle\}$.

Proof. Straightforward calculations. □
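The "straightforward calculations" behind the second statement can be checked numerically. The sketch below assumes the convention $|w_k\rangle = N^{-1/2}\sum_j e^{2\pi i jk/N}|v_j\rangle$ for the formal Fourier conjugate (the sign of the phase is a choice made here; indices run over $0, \ldots, N-1$, which gives the same family as $1, \ldots, N$) and verifies that both families have the same operator $\sum_j |v_j\rangle\langle v_j|$. The helper names and the random test vectors are hypothetical.

```python
import cmath
import math
import random

def fourier_conjugate(vs):
    """Return the family w_k = N^{-1/2} sum_m exp(2 pi i m k / N) v_m."""
    n, dim = len(vs), len(vs[0])
    ws = []
    for k in range(n):
        phases = [cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)]
        ws.append([sum(phases[m] * vs[m][i] for m in range(n)) / math.sqrt(n)
                   for i in range(dim)])
    return ws

def frame_operator(vs):
    """Matrix of sum_m |v_m><v_m| in the computational basis."""
    dim = len(vs[0])
    return [[sum(v[a] * v[b].conjugate() for v in vs) for b in range(dim)]
            for a in range(dim)]

rng = random.Random(2)
N, D = 4, 6   # the family size N may differ from the dimension D
vs = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(D)] for _ in range(N)]
ws = fourier_conjugate(vs)

gram_v, gram_w = frame_operator(vs), frame_operator(ws)
max_diff = max(abs(gram_v[a][b] - gram_w[a][b]) for a in range(D) for b in range(D))
```

The difference `max_diff` vanishes up to floating point error, because the phases form a unitary $N \times N$ matrix acting on the family index.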



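This means that there is another, equivalent, way of arriving at the basis $\{|\hat\phi_k\rangle\}$ of $S$: namely, start with the set of (by Lemma 5 Gaussian!) vectors

$$|\hat g_k\rangle = \frac{1}{\sqrt{N}}\sum_{j=1}^{N} e^{2\pi i jk/N}|g_j\rangle,$$

formally Fourier-conjugate to the $|g_j\rangle$. Then we can form the vectors $|\hat\gamma_k\rangle = \sqrt{|A|\rho}\,|\hat g_k\rangle$, and they are clearly formally Fourier-conjugate to the $|\gamma_j\rangle$. Finally, by Lemma 5 above, the normalisation operator $\hat\Gamma = \sum_k |\hat\gamma_k\rangle\langle\hat\gamma_k|$ equals $\Gamma$, so we find that

$$|\hat\phi_k\rangle = \Gamma^{-1/2}|\hat\gamma_k\rangle = \hat\Gamma^{-1/2}|\hat\gamma_k\rangle.$$

In other words, we have arrived at the

Proposition 6 The distribution of the set $\{|\hat\phi_k\rangle\}_k$ is exactly the same as that of the set $\{|\phi_j\rangle\}_j$. □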

IV. PERFORMANCE ANALYSIS 

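In the previous section we have described a random subspace $S$ of $A'$. The encoder of the code will simply be the isometric identification of $\mathbb{C}^N$ with $S$: $\mathcal{E} = U \cdot U^\dagger$, with

$$U : \mathbb{C}^N \longrightarrow S \hookrightarrow A', \qquad |j\rangle \longmapsto |\phi_j\rangle.$$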



Following Devetak Q - see Lemma 1.1 in [ljj - we 
do not worry about the decoding map; it will exist once 
the "decoupling from the environment" condition holds. 
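Namely, denoting $R = \mathbb{C}^N$, $\tau^R$ the maximally mixed state on $R$, and

$$|\Phi\rangle^{RBE} = (\mathbb{1} \otimes VU)|\Phi_N\rangle,$$

we know that a decoder $\mathcal{D}$ with error $p$ exists once we ascertain that

$$\big\|\Phi^{RE} - \tau^R \otimes \vartheta^E\big\|_1 \leq p,$$

for an arbitrary state $\vartheta^E$ of the environment.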

By Pinsker's inequality [23] for the relative entropy, applied to $\Phi^{RE}$ and $\tau^R \otimes \Phi^E$,

$$I(R:E) = D\big(\Phi^{RE} \,\big\|\, \tau^R \otimes \Phi^E\big) \geq \frac{1}{2\ln 2}\,\big\|\Phi^{RE} - \tau^R \otimes \Phi^E\big\|_1^2,$$

so it is enough to show $I(R:E) \leq p^2/4$. Here, $I(R:E) = H(R) + H(E) - H(RE)$ is the quantum mutual information, and $D(\rho\|\sigma) = \operatorname{Tr}\rho(\log\rho - \log\sigma)$ is the quantum relative entropy.

By the elementary identity

$$2H(R) = I(R:E) + I(R:B),$$

which holds for any pure state on $RBE$, and with $H(R) = \log N$ in our case, we will be done as soon as we show $I(R:B) \geq 2\log N - p^2/4$. The proof that this inequality holds for a random subspace is based on the following "information-uncertainty relation":

Lemma 7 (Information-uncertainty (HI], Lemma 1) 

Let $\mathcal{E}_0 = \{1/N, |j\rangle\langle j|\}$ be the uniform ensemble for an arbitrary fixed orthonormal basis $\{|j\rangle\}$ of an $N$-dimensional Hilbert space $S$, and $\mathcal{E}_1 = \{1/N, \mathrm{QFT}|j\rangle\langle j|\mathrm{QFT}^\dagger\}$, where QFT is the Fourier transform in dimension $N$.

Then, for any quantum channel $\mathcal{M}$ with input space $S$ and output $B$,

$$\chi(\mathcal{M}(\mathcal{E}_0)) + \chi(\mathcal{M}(\mathcal{E}_1)) \leq I(R:B)_\omega.$$

Here, the right hand side is the quantum mutual information of the state $\omega^{RB} = (\mathrm{id} \otimes \mathcal{M})\Phi_N$, where $\Phi_N$ is the maximally entangled state on $RS$. On the left hand side,
we have two Holevo informations IHl] of the ensembles 
$\mathcal{M}(\mathcal{E}_i)$ of channel output states; for an arbitrary ensemble $\mathcal{E} = \{p_x, \sigma_x\}$ of states,

$$\chi(\mathcal{E}) := H\Big(\sum_x p_x\sigma_x\Big) - \sum_x p_x H(\sigma_x).$$

□ 
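The Holevo information appearing on the left hand side is easy to evaluate when all ensemble states commute. The sketch below makes exactly that simplifying assumption: each state is given by its eigenvalues in one common eigenbasis, so that $\chi$ reduces to a classical mutual information. The function names and the two toy ensembles are hypothetical and serve only to show the two extreme values $\log N$ and $0$.

```python
import math

def entropy(spectrum):
    """Entropy in bits of a state, given its eigenvalues."""
    return -sum(p * math.log2(p) for p in spectrum if p > 0)

def holevo_diagonal(probs, states):
    """chi = H(sum_x p_x sigma_x) - sum_x p_x H(sigma_x).

    Assumption: all states are diagonal in the same basis and are passed
    as lists of eigenvalues in that basis.
    """
    dim = len(states[0])
    average = [sum(p * s[i] for p, s in zip(probs, states)) for i in range(dim)]
    return entropy(average) - sum(p * entropy(s) for p, s in zip(probs, states))

N = 4
uniform = [1.0 / N] * N

# Perfectly distinguishable outputs: N orthogonal pure states.
orthogonal = [[1.0 if i == j else 0.0 for i in range(N)] for j in range(N)]
chi_max = holevo_diagonal(uniform, orthogonal)

# Completely indistinguishable outputs: the same mixed state for every input.
identical = [[1.0 / N] * N for _ in range(N)]
chi_min = holevo_diagonal(uniform, identical)
```

A noiseless channel would give $\chi = \log N$ for both conjugate ensembles, saturating $I(R:B) = 2\log N$ in the relation above.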

Of course, the assumption of this lemma is just our situation: we have a subspace $S$ of dimension $N$ in $A$, and consider two Fourier-conjugate bases.

Hence, in the light of Proposition 6, all we need to show is the following:

Proposition 8 Under the assumptions of Theorem 1, consider independent Gaussian vectors $|g_1\rangle, \ldots, |g_N\rangle \in A'$, as well as

$$|\gamma_j\rangle := \sqrt{|A|\rho}\,|g_j\rangle, \qquad \Gamma := \sum_j |\gamma_j\rangle\langle\gamma_j|, \qquad |\phi_j\rangle := \Gamma^{-1/2}|\gamma_j\rangle.$$

Then, for the output ensemble

$$\mathcal{E} = \big\{1/N,\ \sigma_j := \mathcal{N}(|\phi_j\rangle\langle\phi_j|)\big\},$$

it holds with probability $> 1/2$ that

$$\chi(\mathcal{E}) \geq \log N - H_2(2\lambda) - 2\lambda\log N,$$

where

$$\lambda = 9\sqrt{\epsilon} + 7\sqrt{\eta} + 3N\exp(-N\epsilon^2/6).$$

As a consequence, we have that with positive probability both $\mathcal{E}$ and the ensemble obtained from the Fourier-conjugate inputs,

$$\hat{\mathcal{E}} = \big\{1/N,\ \mathcal{N}(\hat\phi_k)\big\},$$

have $\chi \geq \log N - H_2(2\lambda) - 2\lambda\log N$. By Lemma 7 this means $I(R:B) \geq 2\log N - 2H_2(2\lambda) - 4\lambda\log N$, hence $I(R:E) \leq 2H_2(2\lambda) + 4\lambda\log N$, and we are done. Observing that $H_2(x) \leq 2\sqrt{x(1-x)}$, the right hand side can be further upper bounded by $6\sqrt{\lambda} + 4\lambda\log N$, which is $\leq 10\sqrt{\lambda}\log N$ as long as $N \geq 2$.

To conclude, we use Pinsker's inequality, as described at the start of this section, to relate $P^q_{\rm err}$ and (the upper bounds on) the mutual information $I(R:E)$.

Proof of Proposition 8. What we shall show is that there exists a classical decoder for the ensemble achieving small error probability; i.e. we need to find a POVM $(\Lambda_j)_{j=1}^{N}$ such that

$$P^c_{\rm err} = \frac{1}{N}\sum_j \operatorname{Tr}\big[\sigma_j(\mathbb{1} - \Lambda_j)\big]$$

is small, at least in expectation. Then, denoting the 
random output of the measurement j' , we have that 
by the monotonicity of the Holevo quantity under post- 
processing and the classic Fano inequality Q, 

$$\chi(\mathcal{E}) \geq I(j:j') \geq \log N - P^c_{\rm err}\log N - H_2(P^c_{\rm err}).$$

Looking at this, we are done once we show that

$$\mathbb{E}P^c_{\rm err} \leq 9\sqrt{\epsilon} + 7\sqrt{\eta} + 3N\exp(-N\epsilon^2/6) =: \lambda.$$

The reason is Markov's inequality, telling us that the probability of a random code having $P^c_{\rm err} > 2\lambda$ is strictly smaller than 1/2.

For this, we first analyse random codes drawn from the ensemble $|\gamma\rangle = \sqrt{|A|\rho}\,|g\rangle$, with Gaussian $|g\rangle$. The states $\gamma = |\gamma\rangle\langle\gamma|$ and so the $\sigma_g := \mathcal{N}(\gamma)$ are of course not generally normalised, but we can still apply the Packing Lemma (Lemma 9 in appendix A). There, we let $\Pi = P^B$, and the $\Pi_g$ for the individual ensemble states $\mathcal{N}(\gamma)$ are constructed as follows: observe that $|\gamma'\rangle := (\mathbb{1} \otimes P^E)V|\gamma\rangle \in B \otimes E$ is a vector of Schmidt rank at most $d = \operatorname{rank}P^E$, so we may choose $\Pi_g$ to be the projector onto the support of $\operatorname{Tr}_E|\gamma'\rangle\langle\gamma'|$. The conditions of the Packing Lemma are easily verified - observing that $\mathbb{E}\gamma = \rho$, so $\mathbb{E}\sigma_g = \mathcal{N}(\rho) =: \sigma$.

We conclude that for i.i.d. $\{|\gamma_1\rangle, \ldots, |\gamma_N\rangle\}$ there is a POVM $\{\Lambda_1, \ldots, \Lambda_N\}$ such that

$$\mathbb{E}P^c_{\rm err}\big(\{\mathcal{N}(\gamma_j), \Lambda_j\}_{j=1}^{N}\big) \leq 6\sqrt{\epsilon} + 4\eta.$$

Now, if we use the same decoder instead for the states $|\phi_j\rangle = \Gamma^{-1/2}|\gamma_j\rangle$, we incur additional errors, as follows:

First of all, by Lemma 3 applied to $A = \Lambda\rho$ we have, except with probability $\leq 2N\exp(-\Lambda\epsilon^2/4)$, that

$$\forall j \quad 1-\epsilon \leq \langle\gamma_j|\gamma_j\rangle \leq 1+\epsilon, \qquad (3)$$

which we shall assume to hold from now on.

Furthermore, we have, using the elementary inequality $\|\phi - \gamma\|_1 \leq \sqrt{2}\,\|\phi - \gamma\|_2$ for rank one projectors $\phi$ and $\gamma$, and eq. (3), that

1 v— » 1 .. 1 v-~* v2 1, 



N ^2 

j 



^E^V^i^+^-i^ 2 - 2 !^-^ 
7^E v / ( 1 + e ) 2 -i^^-)i 2 



<w^E(( 1+e ) 2 -i^i^)i 2 ) 
<^E 2 ( 1+c )(( 1+c )- 

(4) 

where the second-to-last line follows by the concavity 
of the square root function, and the last involves the 
Cauchy-Schwarz inequality. 

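We shall concentrate for the moment on the average under the square root:

$$\frac{1}{N}\sum_j \big((1+\epsilon) - |\langle\phi_j|\gamma_j\rangle|\big) = \epsilon + \frac{1}{N}\sum_j \big(1 - |\langle\phi_j|\gamma_j\rangle|\big) = \epsilon + \frac{1}{N}\sum_j \big(1 - \langle\gamma_j|\Gamma^{-1/2}|\gamma_j\rangle\big) = \epsilon + 1 - \frac{1}{N}\operatorname{Tr}\sqrt{\Gamma}, \qquad (5)$$

where we have inserted the definition of the $|\phi_j\rangle$, and noted that the inner products $\langle\phi_j|\gamma_j\rangle$ are non-negative. Now, we use a trick for the positive semidefinite operator $\Gamma$,

$$\sqrt{\Gamma} \geq \frac{3}{2}\Gamma - \frac{1}{2}\Gamma^2,$$

so we can continue upper bounding as follows, using the abbreviation $S_{jk} = \langle\gamma_j|\gamma_k\rangle$:

$$1 - \frac{1}{N}\operatorname{Tr}\sqrt{\Gamma} \leq 1 - \frac{1}{N}\operatorname{Tr}\Big(\frac{3}{2}\Gamma - \frac{1}{2}\Gamma^2\Big) = \frac{1}{N}\sum_j \Big(1 - \frac{3}{2}S_{jj} + \frac{1}{2}S_{jj}^2\Big) + \frac{1}{2N}\sum_{j \neq k}|S_{jk}|^2 = \frac{1}{N}\sum_j (1 - S_{jj})\Big(1 - \frac{1}{2}S_{jj}\Big) + \frac{1}{2N}\sum_{j \neq k}|S_{jk}|^2. \qquad (6)$$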



Here, the first term is bounded above by e-^. The sec- 
ond term consists of an average of N expressions, one for 
each j, of the form 

k^j k^j 



(l 3 W\A\p\g k ) 



with a rank one projector Pj. So we can apply Lemma[3] 
once more to find that, except with probability < 
N exp(— Ne 2 /6), the latter expressions are all upper 
bounded by 

l + e U i + e < i + e V 
A \A\ 

Inserting all this into eq. ©, we find 

I ^ ((1 + e ) - | | 7j .) |) < e + e i±£ + (1 + e )\ 

3 

In turn plugging that into eq. ([¥]), we arrive at 

JjY\Ui- li 1 1 1 < ^2(1+6) ( e l±i + (l + e )2^ 

< V9e + 977 < 3v^ + 3V^, 

remembering e < 1/3. 

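Putting all this together, with the monotonicity of the trace norm under cptp maps and using $\operatorname{Tr}((\rho - \sigma)A) \leq \frac{1}{2}\|\rho - \sigma\|_1$ for states $\rho$, $\sigma$ and $0 \leq A \leq \mathbb{1}$, leads to

$$\mathbb{E}P^c_{\rm err}\big(\{\mathcal{N}(\phi_j), \Lambda_j\}_{j=1}^{N}\big) \leq \mathbb{E}P^c_{\rm err}\big(\{\mathcal{N}(\gamma_j), \Lambda_j\}_{j=1}^{N}\big) + 3\sqrt{\epsilon} + 3\sqrt{\eta} + 3N\exp(-N\epsilon^2/6) \leq 6\sqrt{\epsilon} + 4\eta + 3\sqrt{\epsilon} + 3\sqrt{\eta} + 3N\exp(-N\epsilon^2/6),$$

and we are done. □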

V. CONCLUSION 



We have given yet another proof of the direct part 
of the quantum channel coding theorem, in the sense 
of showing the achievability of the coherent information 
rate. 

The present proof is distinguished from other ap- 
proaches in that it is shown that the classical informa- 
tion in two Fourier-conjugate bases of the code subspace 
can be recovered at the output. Application of a re- 
cent information-uncertainty relation then ensures that 
the quantum information in the subspace can in fact be 
decoded. 



It is tempting to speculate that the role of the pair 
of measurement-decoders for the two conjugate bases is 
to implement the measurement of the familiar basis and 
phase errors of a conventional quantum error correcting 
code, or their equivalents. To give more substance to 
this idea, it would be necessary to show how to build the 
quantum decoder directly from the two measurement- 
decoders. We leave this as an open problem.

APPENDIX A: MISCELLANEOUS LEMMAS



Lemma 9 (Packing [18]) Consider an ensemble $\{p_m, \sigma_m\}$ of positive semidefinite operators (not necessarily states!) with average $\sigma = \sum_m p_m\sigma_m$, which is assumed to be a density operator; in particular, $\sum_m p_m\operatorname{Tr}\sigma_m = 1$. Assume the existence of projectors $\Pi$ and $\Pi_m$ with the following properties:

$$\sum_m p_m\operatorname{Tr}\sigma_m\Pi_m \geq 1-\epsilon,$$
$$\sum_m p_m\operatorname{Tr}\sigma_m\Pi \geq 1-\epsilon,$$
$$\operatorname{Tr}\Pi_m \leq d,$$
$$\Pi\sigma\Pi \leq D^{-1}\Pi,$$

for all $m$. Let $N = \lfloor\eta D/d\rfloor$ for some $0 < \eta < 1$, and pick $m_1, \ldots, m_N$ independently at random according to the distribution $p_m$.

Then there exists a corresponding POVM $\{\Lambda_k\}_{k=1}^{N}$ which reliably distinguishes between the states $\{\sigma_{m_k}\}_{k=1}^{N}$ in the sense that the expectation of the (average) error probability of the code $\{\sigma_{m_k}, \Lambda_k\}_{k=1}^{N}$,

$$P_{\rm err} = P^c_{\rm err}\big(\{\sigma_{m_k}, \Lambda_k\}\big) := \frac{1}{N}\sum_k \operatorname{Tr}\big[\sigma_{m_k}(\mathbb{1} - \Lambda_k)\big],$$

satisfies

$$\mathbb{E}P_{\rm err} \leq 2\epsilon + 4\sqrt{\epsilon} + 4\eta \leq 6\sqrt{\epsilon} + 4\eta.$$

(In particular, there exists a code with error bounded by the above quantity.)

The same statements hold for continuous ensembles - the above formulation with a discrete probability distribution was chosen only for notational convenience. □

Proof. It is almost the same statement and proof as Lemma 2 in [18], which itself is an adaptation of a result by Hayashi and Nagaoka [11].

Note that we demand state normalisation of the $\sigma_m$ not individually, but only in the ensemble average - which makes the lemma more suitable to be applied with the, generally unnormalised, Gaussian input states. Inspecting the proof in [18], it is evident that in fact only that is required.

There are only the following two other differences. We use the slightly better "Gentle measurement Lemma" of Ogawa and Nagaoka [22] instead of [27] - see Lemma 10 below. And whereas [18] demands that for all $m$,

$$\operatorname{Tr}\sigma_m\Pi_m,\ \operatorname{Tr}\sigma_m\Pi \geq 1-\epsilon,$$

our conditions on $\Pi$ and the $\Pi_m$ require this to hold only on average over the ensemble $\{p_m, \sigma_m : m \in M\}$. Looking at the proof in [18], it is evident that this condition is indeed enough for the conclusion. □



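Lemma 10 (Gentle measurement [27] and [22])

Let $\rho$ be positive semidefinite, and $0 \leq X \leq \mathbb{1}$ be an operator on some Hilbert space, such that $\operatorname{Tr}(\rho(\mathbb{1} - X)) \leq \epsilon\operatorname{Tr}\rho$. Then,

$$\big\|\rho - \sqrt{X}\rho\sqrt{X}\big\|_1 \leq 2\sqrt{\epsilon}\,\operatorname{Tr}\rho.$$

□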

Here follow some properties of typical subspaces as 
defined in [24|; we quote directly from [ljj. Consider 
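a density matrix with spectral decomposition $\rho^A = \sum_x p_x|x\rangle\langle x|^A$. Its $n$th tensor power can be written as

$$(\rho^A)^{\otimes n} = \sum_{x^n} p_{x^n}|x^n\rangle\langle x^n|^{A^n},$$

where $p_{x^n} = p_{x_1}\cdots p_{x_n}$ and $|x^n\rangle^{A^n} = |x_1\rangle\cdots|x_n\rangle$. The $\delta$-(entropy) typical subspace $A_\delta \leq A^n$ is defined as

$$A_\delta = \operatorname{span}\Big\{ |x^n\rangle : \Big| -\frac{1}{n}\log p_{x^n} - H(\rho^A) \Big| \leq \delta \Big\},$$

and the $\delta$-typical projection $P^A_\delta$ is defined to project $A^n$ onto $A_\delta$. We shall need the following lemma: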

Lemma 11 (Typicality) Let a tripartite pure state 
\i[>) ABC be given. For every 5 > and all sufficiently 
large n there are S -typical projections P A , P^ and Pf 



onto 5 -typical subspaces As C A n , P>s ^= B n and Es C 
E n , respectively, such that the states 



ABE\C$n 



I 'A 



satisfy 

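$$|A_\delta| \leq 2^{nH(A)+n\delta},$$
$$|B_\delta| \leq 2^{nH(B)+n\delta},$$
$$|E_\delta| \leq 2^{nH(E)+n\delta},$$

$$P^B_\delta\,\psi^{B^n} P^B_\delta \leq 2^{-nH(B)+n\delta}\,P^B_\delta,$$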

||^" Bns "-Vf BnBn ||i<e, 

where e — 2~ cnS for some constant c > independent of 
5 and n. 



Proof. See [Uj. 
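The dimension bound for a typical subspace can be illustrated for a single qubit with a diagonal state, where the typical basis vectors are simply the typical binary strings. The following sketch assumes a spectrum $(0.8, 0.2)$, block length $n = 100$ and $\delta = 0.11$ (all hypothetical choices) and counts the basis vectors of $A_\delta$ together with the probability weight they carry.

```python
import math

def typical_subspace_stats(p1, n, delta):
    """Dimension and weight of the delta-typical subspace for spectrum (1-p1, p1)."""
    p0 = 1.0 - p1
    h = -(p0 * math.log2(p0) + p1 * math.log2(p1))   # H(rho^A) in bits
    dimension, weight = 0, 0.0
    for k in range(n + 1):                            # k = number of symbols '1'
        log_prob = (n - k) * math.log2(p0) + k * math.log2(p1)
        if abs(-log_prob / n - h) <= delta:           # typicality condition
            multiplicity = math.comb(n, k)            # strings with k ones
            dimension += multiplicity
            weight += multiplicity * 2.0 ** log_prob
    return h, dimension, weight

n, delta = 100, 0.11
h, dimension, weight = typical_subspace_stats(0.2, n, delta)
dimension_bound = 2.0 ** (n * (h + delta))            # 2^{nH(A) + n delta}
```

Already at this moderate block length the typical subspace carries most of the weight of $\rho^{\otimes n}$, while its dimension stays below $2^{nH(A)+n\delta}$, far smaller than the full $2^n$.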

APPENDIX B: PROOF OF LEMMA 3, EQS. (1) AND (2)

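We shall use the following easy lemma:

Lemma 12 Let $\delta < 1$. Then:

$$\text{for } -\delta \leq x \leq 0, \quad \ln(1+x) \geq x - \frac{x^2}{2}\,\frac{1}{1-\delta};$$

$$\text{for } 0 \leq x \leq 1, \quad \ln(1+x) \geq x - \frac{x^2}{2}.$$

Proof. By Taylor expansion, $\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5} - \cdots$.

The second bound is the easier one: just group each (positive) odd term with its immediately consecutive (negative) even term, i.e.

$$\frac{x^3}{3} - \frac{x^4}{4}, \quad \frac{x^5}{5} - \frac{x^6}{6}, \quad \text{etc.},$$

all of which are clearly non-negative, and we are done.

For the first bound, write $y = -x \leq \delta$, and observe

$$\ln(1+x) = \ln(1-y) = -y - \frac{y^2}{2} - \frac{y^3}{3} - \cdots = -y - \frac{y^2}{2}\Big(1 + \frac{2}{3}y + \frac{2}{4}y^2 + \cdots\Big) \geq -y - \frac{y^2}{2}\big(1 + y + y^2 + y^3 + \cdots\big) = x - \frac{x^2}{2}\,\frac{1}{1-y} \geq x - \frac{x^2}{2}\,\frac{1}{1-\delta}.$$

□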
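Both inequalities of the lemma are easy to sanity-check on a grid. The sketch below assumes the reading $\ln(1+x) \geq x - \frac{x^2}{2}\frac{1}{1-\delta}$ on $[-\delta, 0]$ and $\ln(1+x) \geq x - \frac{x^2}{2}$ on $[0, 1]$; the function names and grid sizes are arbitrary, and such a check naturally complements rather than replaces the proof.

```python
import math

def lower_bound_negative(x, delta):
    """Claimed lower bound on ln(1+x) for -delta <= x <= 0, with delta < 1."""
    return x - (x ** 2 / 2) / (1 - delta)

def lower_bound_positive(x):
    """Claimed lower bound on ln(1+x) for 0 <= x <= 1."""
    return x - x ** 2 / 2

def smallest_gap_negative(delta, steps=1000):
    """Minimum of ln(1+x) minus the bound over a grid on [-delta, 0]."""
    grid = (-delta * i / steps for i in range(steps + 1))
    return min(math.log1p(x) - lower_bound_negative(x, delta) for x in grid)

def smallest_gap_positive(steps=1000):
    """Minimum of ln(1+x) minus the bound over a grid on [0, 1]."""
    grid = (i / steps for i in range(steps + 1))
    return min(math.log1p(x) - lower_bound_positive(x) for x in grid)

gaps_negative = [smallest_gap_negative(d) for d in (0.25, 0.5, 0.9)]
gap_positive = smallest_gap_positive()
```

In both regimes the gap is smallest at $x = 0$, where the two sides coincide; $\delta = 1/4$ is the value used for the upper tail bound.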



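Proof of the probability bounds (1) and (2). Write $A$ in its eigenbasis, $A = \sum_i a_i|i\rangle\langle i|$ with $0 \leq a_i \leq 1$. The Gaussian vector is $|g\rangle = \sum_i c_i|i\rangle$, with $c_i \sim \mathcal{N}_{\mathbb{C}}(0, 1/D)$. Then $\operatorname{Tr}(|g\rangle\langle g|A) = \sum_i a_i|c_i|^2$ is a weighted sum of independent random variables - which is where the large deviation behaviour will come from.

The "Bernstein trick" is the realisation that (for $t > 0$)

$$\Pr\Big\{\sum_i a_i|c_i|^2 > (1+\epsilon)\frac{\operatorname{Tr}A}{D}\Big\} = \Pr\Big\{e^{t\sum_i a_i|c_i|^2} > e^{t(1+\epsilon)(\operatorname{Tr}A)/D}\Big\}$$
$$\leq \Big(\mathbb{E}\,e^{t\sum_i a_i|c_i|^2}\Big)\,e^{-t(1+\epsilon)(\operatorname{Tr}A)/D}$$
$$= \prod_i \Big(\mathbb{E}\,e^{t a_i|c_i|^2}\Big)\,e^{-t(1+\epsilon)a_i/D},$$

the second line by Markov's inequality, and the third by independence of the $c_i$. We take the evaluation of the expectation above (known as "moment generating function") from Lemma 23 (appendix A): for $t < D/a_i$,

$$\mathbb{E}\,e^{t a_i|c_i|^2} = \frac{1}{1 - t a_i/D}.$$

Plugging this in and letting $t = D\frac{\epsilon}{1+\epsilon}$, we get the upper bound on the probability in question, of

$$\prod_i e^{-\epsilon a_i - \ln\left(1 - \frac{\epsilon a_i}{1+\epsilon}\right)}.$$

The exponents can be upper bounded using Lemma 12: because we assume $\epsilon \leq 1/3$, the argument is bounded above by $(1/3)/(1+1/3) = 1/4$, so we get

$$-\epsilon a_i - \ln\Big(1 - \frac{\epsilon a_i}{1+\epsilon}\Big) \leq -\epsilon a_i + \frac{\epsilon a_i}{1+\epsilon} + \frac{1}{2}\,\frac{1}{1-1/4}\Big(\frac{\epsilon a_i}{1+\epsilon}\Big)^2 = -\frac{\epsilon^2 a_i}{1+\epsilon} + \frac{2\epsilon^2 a_i^2}{3(1+\epsilon)^2} \leq -\frac{1}{4}\epsilon^2 a_i,$$

where the last step uses $a_i^2 \leq a_i$ and $\epsilon \leq 1/3$.

So, we finally get that the probability in (1) is upper bounded by

$$\prod_i e^{-\epsilon^2 a_i/4} = \exp\Big(-\frac{\epsilon^2}{4}\operatorname{Tr}A\Big),$$

which is what we wanted.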



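The bound in the other direction is fairly similar: here we have, for $t > 0$, and pretty much as before (noting that the extra minus sign reverses the direction of the inequality),

$$\Pr\Big\{\sum_i a_i|c_i|^2 < (1-\epsilon)\frac{\operatorname{Tr}A}{D}\Big\} = \Pr\Big\{e^{-t\sum_i a_i|c_i|^2} > e^{-t(1-\epsilon)(\operatorname{Tr}A)/D}\Big\}$$
$$\leq \Big(\mathbb{E}\,e^{-t\sum_i a_i|c_i|^2}\Big)\,e^{t(1-\epsilon)(\operatorname{Tr}A)/D}$$
$$= \prod_i \Big(\mathbb{E}\,e^{-t a_i|c_i|^2}\Big)\,e^{t(1-\epsilon)a_i/D}$$
$$= \prod_i \frac{1}{1 + t a_i/D}\,e^{t(1-\epsilon)a_i/D}$$
$$= \prod_i e^{t(1-\epsilon)a_i/D - \ln(1 + t a_i/D)}.$$

Now, choosing $t = D\frac{\epsilon}{1-\epsilon}$, the exponent for each $i$ is

$$\epsilon a_i - \ln\Big(1 + \frac{\epsilon a_i}{1-\epsilon}\Big) \leq \epsilon a_i - \frac{\epsilon a_i}{1-\epsilon} + \frac{1}{2}\Big(\frac{\epsilon a_i}{1-\epsilon}\Big)^2 \leq -\frac{\epsilon^2 a_i}{1-\epsilon} + \frac{1}{2}\,\frac{\epsilon^2 a_i}{(1-\epsilon)^2} \leq -\epsilon^2 a_i\Big(1 - \frac{1}{2(1-1/3)}\Big) = -\frac{1}{4}\epsilon^2 a_i,$$

where we have once more invoked Lemma 12 and used $\epsilon \leq 1/3$. □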